Abstract: The outcome of a One Day International (ODI) cricket match depends on various factors. This research aims to identify the factors that play a key role in predicting the outcome of an ODI cricket match and to determine the accuracy of predictions made using data mining techniques. In this analysis, the statistical significance of various variables that could explain the outcome of an ODI cricket match is explored. Home field advantage, winning the toss, game plan (batting first or fielding first), match type (day or day & night), competing team, venue familiarity and the season in which the match is played are the key features studied. For model building, three algorithms are adopted: Logistic Regression, Support Vector Machine (SVM) and Naïve Bayes. Logistic regression is applied to data from previously played matches to identify which features, individually or in combination with other features, play a role in the prediction. SVM and the Naïve Bayes classifier are used for model training and predictive analysis. Graphical representations and confusion matrices are used to present the resulting models, and a comparative analysis is performed on them. A bidding scenario is also considered to illustrate the decisions that can be taken after the model has been built. The effect of these decisions on the cost and payoff of the model is also studied.
Keywords: Analytics, Cricket, Sports Prediction, Logistic Regression, Naïve Bayes, Support Vector Machine